{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {},
  "cells": [
    {
      "metadata": {},
      "source": [
        "<td>\n",
        "   <a target=\"_blank\" href=\"https://labelbox.com\" ><img src=\"https://labelbox.com/blog/content/images/2021/02/logo-v4.svg\" width=256/></a>\n",
        "</td>"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "<td>\n",
        "<a href=\"https://colab.research.google.com/github/Labelbox/labelbox-python/blob/master/examples/annotation_import/audio.ipynb\" target=\"_blank\"><img\n",
        "src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"></a>\n",
        "</td>\n",
        "\n",
        "<td>\n",
        "<a href=\"https://github.com/Labelbox/labelbox-python/tree/master/examples/annotation_import/audio.ipynb\" target=\"_blank\"><img\n",
        "src=\"https://img.shields.io/badge/GitHub-100000?logo=github&logoColor=white\" alt=\"GitHub\"></a>\n",
        "</td>"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "# Audio Annotation Import\n",
        "* This notebook will provide examples of each supported annotation type for audio assets, and also cover MAL and Label Import methods:\n",
        "\n",
        "Suported annotations that can be uploaded through the SDK\n",
        "\n",
        "* Classification Radio \n",
        "* Classification Checklist \n",
        "* Classification Free Text \n",
        "\n",
        "**Not** supported annotations\n",
        "\n",
        "* Bouding box\n",
        "* NER\n",
        "* Polygon \n",
        "* Point\n",
        "* Polyline \n",
        "* Segmentation Mask\n",
        "\n",
        "MAL and Label Import:\n",
        "\n",
        "* Model-assisted labeling - used to provide pre-annotated data for your labelers. This will enable a reduction in the total amount of time to properly label your assets. Model-assisted labeling does not submit the labels automatically, and will need to be reviewed by a labeler for submission.\n",
        "* Label Import - used to provide ground truth labels. These can in turn be used and compared against prediction labels, or used as benchmarks to see how your labelers are doing.\n",
        "\n"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "* For information on what types of annotations are supported per data type, refer to this documentation:\n",
        "    * https://docs.labelbox.com/docs/model-assisted-labeling#option-1-import-via-python-annotation-types-recommended"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "* Notes:\n",
        "    * Wait until the import job is complete before opening the Editor to make sure all annotations are imported properly."
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "!pip install -q 'labelbox[data]'"
      ],
      "cell_type": "code",
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\n",
            "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip available: \u001b[0m\u001b[31;49m22.3.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.0.1\u001b[0m\n",
            "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
          ]
        }
      ],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "# Setup"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "import labelbox as lb\n",
        "import uuid\n",
        "import labelbox.types as lb_types"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "# Replace with your API key\n",
        "Guides on [Create an API key](https://docs.labelbox.com/docs/create-an-api-key)"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "# Add your api key\n",
        "API_KEY = \"\"\n",
        "client = lb.Client(api_key=API_KEY)"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "## Supported annotations for Audio"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "##### Classification free text #####\n",
        "\n",
        "text_annotation = lb_types.ClassificationAnnotation(\n",
        "    name=\"text_audio\",\n",
        "    value=lb_types.Text(answer=\"free text audio annotation\"),\n",
        ")\n",
        "\n",
        "text_annotation_ndjson = {\n",
        "    'name': 'text_audio',\n",
        "    'answer': 'free text audio annotation',\n",
        "}"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "##### Checklist Classification ####### \n",
        "\n",
        "checklist_annotation= lb_types.ClassificationAnnotation(\n",
        "  name=\"checklist_audio\",\n",
        "  value=lb_types.Checklist(\n",
        "      answer = [\n",
        "        lb_types.ClassificationAnswer(\n",
        "            name = \"first_checklist_answer\"\n",
        "        ), \n",
        "        lb_types.ClassificationAnswer(\n",
        "            name = \"second_checklist_answer\"\n",
        "        )\n",
        "      ]\n",
        "    ),\n",
        " )\n",
        "\n",
        "\n",
        "checklist_annotation_ndjson = {\n",
        "    'name': 'checklist_audio',\n",
        "    'answers': [\n",
        "        {'name': 'first_checklist_answer'},\n",
        "        {'name': 'second_checklist_answer'}\n",
        "    ],\n",
        "}"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "######## Radio Classification ######\n",
        "\n",
        "radio_annotation = lb_types.ClassificationAnnotation(\n",
        "    name=\"radio_audio\",\n",
        "    value=lb_types.Radio(answer=lb_types.ClassificationAnswer(\n",
        "        name=\"second_radio_answer\")))\n",
        "\n",
        "radio_annotation_ndjson = {\n",
        "    'name': 'radio_audio',\n",
        "    'answer': {\n",
        "        'name': 'first_radio_answer'\n",
        "    },\n",
        "}"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "## Upload Annotations - putting it all together "
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "## Step 1: Import data rows into Catalog"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "# Create one Labelbox dataset\n",
        "\n",
        "global_key = \"sample-audio-1.mp3\"\n",
        "\n",
        "asset = {\n",
        "    \"row_data\": \"https://storage.googleapis.com/labelbox-datasets/audio-sample-data/sample-audio-1.mp3\",\n",
        "    \"global_key\": global_key\n",
        "}\n",
        "\n",
        "dataset = client.create_dataset(name=\"audio_annotation_import_demo_dataset\")\n",
        "task = dataset.create_data_rows([asset])\n",
        "task.wait_till_done()\n",
        "print(\"Errors:\", task.errors)\n",
        "print(\"Failed data rows: \", task.failed_data_rows)"
      ],
      "cell_type": "code",
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Errors: None\n",
            "Failed data rows:  None\n"
          ]
        }
      ],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "## Step 2: Create/select an ontology\n",
        "\n",
        "Your project should have the correct ontology setup with all the tools and classifications supported for your annotations, and the tool names and classification instructions should match the `name` fields in your annotations to ensure the correct feature schemas are matched.\n",
        "\n",
        "For example, when we create the text annotation, we provided the `name` as `text_audio`. Now, when we setup our ontology, we must ensure that the name of the tool is also `text_audio`. The same alignment must hold true for the other tools and classifications we create in our ontology."
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "ontology_builder = lb.OntologyBuilder(\n",
        "  classifications=[ \n",
        "    lb.Classification( \n",
        "      class_type=lb.Classification.Type.TEXT,\n",
        "      name=\"text_audio\"), \n",
        "    lb.Classification( \n",
        "      class_type=lb.Classification.Type.CHECKLIST,                   \n",
        "      name=\"checklist_audio\", \n",
        "      options=[\n",
        "        lb.Option(value=\"first_checklist_answer\"),\n",
        "        lb.Option(value=\"second_checklist_answer\")            \n",
        "      ]\n",
        "    ), \n",
        "    lb.Classification( \n",
        "      class_type=lb.Classification.Type.RADIO, \n",
        "      name=\"radio_audio\", \n",
        "      options=[\n",
        "        lb.Option(value=\"first_radio_answer\"),\n",
        "        lb.Option(value=\"second_radio_answer\")\n",
        "      ]\n",
        "    )\n",
        "  ]\n",
        ")\n",
        "\n",
        "ontology = client.create_ontology(\"Ontology Audio Annotations\", ontology_builder.asdict(), media_type=lb.MediaType.Audio)"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "\n",
        "## Step 3: Create a labeling project\n",
        "Connect the ontology to the labeling project"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "# Create Labelbox project\n",
        "project = client.create_project(name=\"audio_project\", \n",
        "                                    media_type=lb.MediaType.Audio)\n",
        "\n",
        "# Setup your ontology \n",
        "project.setup_editor(ontology) # Connect your ontology and editor to your project"
      ],
      "cell_type": "code",
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "Default createProject behavior will soon be adjusted to prefer batch projects. Pass in `queue_mode` parameter explicitly to opt-out for the time being.\n"
          ]
        }
      ],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "## Step 4: Send a batch of data rows to the project"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "# Setup Batches and Ontology\n",
        "\n",
        "# Create a batch to send to your MAL project\n",
        "batch = project.create_batch(\n",
        "  \"first-batch-audio-demo\", # Each batch in a project must have a unique name\n",
        "  global_keys=[global_key], # Paginated collection of data row objects, list of data row ids or global keys\n",
        "  priority=5 # priority between 1(Highest) - 5(lowest)\n",
        ")\n",
        "\n",
        "print(\"Batch: \", batch)"
      ],
      "cell_type": "code",
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Batch:  <Batch {\n",
            "    \"consensus_settings_json\": \"{\\\"numberOfLabels\\\":1,\\\"coveragePercentage\\\":0}\",\n",
            "    \"created_at\": \"2023-03-28 18:22:33+00:00\",\n",
            "    \"name\": \"first-batch-audio-demo\",\n",
            "    \"size\": 0,\n",
            "    \"uid\": \"81f8c9a0-cd95-11ed-872d-bb648b922b15\",\n",
            "    \"updated_at\": \"2023-03-28 18:22:33+00:00\"\n",
            "}>\n"
          ]
        }
      ],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "## Step 5: Create the annotations payload\n",
        "Create the annotations payload using the snippets of code above\n",
        "\n",
        "Labelbox support two formats for the annotations payload: NDJSON and Python Annotation types."
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "#### Python annotation\n",
        "Here we create the complete labels ndjson payload of annotations only using python annotation format. There is one annotation for each reference to an annotation that we created. "
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "label = []\n",
        "label.append(\n",
        "  lb_types.Label(\n",
        "    data=lb_types.AudioData(\n",
        "      global_key=global_key\n",
        "    ),\n",
        "    annotations=[\n",
        "      text_annotation,\n",
        "      checklist_annotation,\n",
        "      radio_annotation\n",
        "    ]\n",
        "  )\n",
        ")"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "### NDJSON annotations \n",
        "Here we create the complete label NDJSON payload of annotations only using NDJSON format. There is one annotation for each reference to an annotation that we created [above](https://colab.research.google.com/drive/1rFv-VvHUBbzFYamz6nSMRJz1mEg6Ukqq#scrollTo=3umnTd-MfI0o&line=1&uniqifier=1)."
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "label_ndjson = []\n",
        "for annotations in [text_annotation_ndjson,\n",
        "                    checklist_annotation_ndjson,\n",
        "                    radio_annotation_ndjson]:\n",
        "  annotations.update({\n",
        "      'dataRow': {\n",
        "          'globalKey': global_key\n",
        "      }\n",
        "  })\n",
        "  label_ndjson.append(annotations)"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "### Step 6: Upload annotations to a project as pre-labels or complete labels"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "#### Model Assisted Labeling (MAL)\n",
        "For the purpose of this tutorial only run one of the label_ndjosn annotation type tools at the time (NDJSON or Annotation types). Delete the previous labels before uploading labels that use the 2nd method (ndjson)"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "# Upload our label using Model-Assisted Labeling\n",
        "upload_job = lb.MALPredictionImport.create_from_objects(\n",
        "    client = client, \n",
        "    project_id = project.uid, \n",
        "    name=f\"mal_job-{str(uuid.uuid4())}\", \n",
        "    predictions=label)\n",
        "\n",
        "upload_job.wait_until_done();\n",
        "print(\"Errors:\", upload_job.errors)\n",
        "print(\"Status of uploads: \", upload_job.statuses)"
      ],
      "cell_type": "code",
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Errors: []\n",
            "Status of uploads:  [{'uuid': '40aea601-206f-491c-8bcc-675205dbb351', 'dataRow': {'id': 'clfsl56ww5uyi078ldbojeqby', 'globalKey': 'sample-audio-1.mp3'}, 'status': 'SUCCESS'}, {'uuid': '51c43a0e-a54b-4ee3-94c1-7ccdebe81f98', 'dataRow': {'id': 'clfsl56ww5uyi078ldbojeqby', 'globalKey': 'sample-audio-1.mp3'}, 'status': 'SUCCESS'}, {'uuid': '380273f0-4ed4-4d0a-959a-06659f5edf88', 'dataRow': {'id': 'clfsl56ww5uyi078ldbojeqby', 'globalKey': 'sample-audio-1.mp3'}, 'status': 'SUCCESS'}]\n"
          ]
        }
      ],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "#### Label Import"
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "# Upload label for this data row in project \n",
        "upload_job = lb.LabelImport.create_from_objects(\n",
        "    client = client, \n",
        "    project_id = project.uid, \n",
        "    name=\"label_import_job\"+str(uuid.uuid4()),  \n",
        "    labels=label)\n",
        "\n",
        "upload_job.wait_until_done();\n",
        "print(\"Errors:\", upload_job.errors)\n",
        "print(\"Status of uploads: \", upload_job.statuses)"
      ],
      "cell_type": "code",
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Errors: []\n",
            "Status of uploads:  [{'uuid': '1e555dd0-3916-4cfb-97be-79f9607da01a', 'dataRow': {'id': 'clfsl56ww5uyi078ldbojeqby', 'globalKey': 'sample-audio-1.mp3'}, 'status': 'SUCCESS'}, {'uuid': 'fe805388-d313-45ea-b3bc-1a7f5ffa0980', 'dataRow': {'id': 'clfsl56ww5uyi078ldbojeqby', 'globalKey': 'sample-audio-1.mp3'}, 'status': 'SUCCESS'}, {'uuid': '83cea6f0-acdf-4a2c-afff-197bac9bdb01', 'dataRow': {'id': 'clfsl56ww5uyi078ldbojeqby', 'globalKey': 'sample-audio-1.mp3'}, 'status': 'SUCCESS'}]\n"
          ]
        }
      ],
      "execution_count": null
    },
    {
      "metadata": {},
      "source": [
        "### Optional deletions for cleanup "
      ],
      "cell_type": "markdown"
    },
    {
      "metadata": {},
      "source": [
        "# project.delete()\n",
        "# dataset.delete()"
      ],
      "cell_type": "code",
      "outputs": [],
      "execution_count": null
    }
  ]
}